Datasets for the Evaluation of Substitution-Tolerant Subgraph Isomorphism

Authors

  • Pierre Héroux
  • Pierre Le Bodic
  • Sébastien Adam
Abstract

Due to their representative power, structural descriptions have gained great interest in the graphics recognition community. Indeed, graph-based representations have successfully been used for isolated symbol recognition. New challenges in this research field have focused on symbol recognition, symbol spotting, or symbol-based indexing of technical drawings. When they are based on structural descriptions, these tasks can be expressed by means of a subgraph isomorphism search. Such a search consists of locating the instance of a pattern graph representing a symbol in a target graph representing the whole document image. However, there is a lack of publicly available datasets for evaluating the performance of subgraph isomorphism approaches in the presence of noisy data. In this paper, we present three datasets that can be used to evaluate the performance of algorithms on several tasks involving subgraph isomorphism. Two of these datasets have been synthetically generated and allow evaluation of the search for a single instance of the pattern, with or without perturbed labels. The third dataset corresponds to the structural description of architectural plans and allows evaluation of the search for multiple occurrences of the pattern. These datasets are made available for download. We also propose several measures to qualify each of the tasks.
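
To make the task concrete, the following is a minimal brute-force sketch of a substitution-tolerant subgraph search. It is not the authors' method and does not use their dataset format: the dictionary-based graph layout, the function name `best_substitution_tolerant_match`, and the absolute-difference `label_cost` are all assumptions chosen for illustration. The sketch requires every pattern edge to exist in the target, tolerates differing labels, and returns the injective mapping with the lowest total substitution cost.

```python
from itertools import permutations


def best_substitution_tolerant_match(pattern, target, cost):
    """Return (total_cost, mapping) for the cheapest match, or None.

    Graphs are hypothetical dicts: {"nodes": {id: label},
    "edges": {(u, v): label}} with directed edges. A mapping is feasible
    when it is injective and every pattern edge has a counterpart in the
    target; labels may differ, and each difference is charged by `cost`.
    """
    p_nodes = list(pattern["nodes"])
    best = None
    # Enumerate every injective assignment of pattern nodes to target nodes.
    for image in permutations(target["nodes"], len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        # Cost of substituting vertex labels.
        total = sum(
            cost(pattern["nodes"][u], target["nodes"][mapping[u]])
            for u in p_nodes
        )
        feasible = True
        for (u, v), label in pattern["edges"].items():
            t_edge = (mapping[u], mapping[v])
            if t_edge not in target["edges"]:
                feasible = False  # topology must be preserved exactly
                break
            # Cost of substituting the edge label.
            total += cost(label, target["edges"][t_edge])
        if feasible and (best is None or total < best[0]):
            best = (total, mapping)
    return best


def label_cost(a, b):
    # Assumed cost: absolute difference between numeric labels.
    return abs(a - b)


# Toy pattern: two vertices joined by one edge.
pattern = {"nodes": {"a": 10, "b": 20}, "edges": {("a", "b"): 5}}
# Toy target: a three-vertex chain whose first vertex label is perturbed.
target = {
    "nodes": {"x": 11, "y": 20, "z": 50},
    "edges": {("x", "y"): 5, ("y", "z"): 5},
}

result = best_substitution_tolerant_match(pattern, target, label_cost)
print(result)
```

The exhaustive enumeration is exponential in the pattern size, so it is only suitable for toy graphs; it is meant to show what a single-instance search with perturbed labels asks for, not how to solve it efficiently.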

Similar resources

GEM++: A Tool for Solving Substitution-Tolerant Subgraph Isomorphism

The substitution-tolerant subgraph isomorphism is a particular error-tolerant subgraph matching that allows label substitutions for both vertices and edges. Such a matching is often required in pattern recognition applications, since graphs extracted from images are generally labeled with feature vectors computed from raw data, which are naturally subject to noise. This paper describes an extend...

Efficiency Improvements for Parallel Subgraph Miners

Algorithms for finding frequent and/or interesting subgraphs in a single large graph scenario are computationally intensive because of the graph isomorphism and the subgraph isomorphism problem. These problems are compounded by the size of most real-world datasets which have sizes in the order of 10 or 10. The SUBDUE algorithm developed by Cook and Holder finds the most compressing subgraph in ...

Exploiting Vertex Relationships in Speeding up Subgraph Isomorphism over Large Graphs

Subgraph Isomorphism is a fundamental problem in graph data processing. Most existing subgraph isomorphism algorithms are based on a backtracking framework which computes the solutions by incrementally matching all query vertices to candidate data vertices. However, we observe that extensive duplicate computation exists in these algorithms, and such duplicate computation can be avoided by explo...

GraphCache: A Caching System for Graph Queries

Graph query processing is essential for graph analytics, but can be very time-consuming as it entails the NP-Complete problem of subgraph isomorphism. Traditionally, caching plays a key role in expediting query processing. We thus put forth GraphCache (GC), the first full-fledged caching system for general subgraph/supergraph queries. We contribute the overall system architecture and implementa...

Graph matching: filtering databases of graphs using machine learning techniques

Graphs are a powerful concept useful for various tasks in science and engineering. In applications such as pattern recognition and information retrieval, object similarity is an important issue. If graphs are used for object representation, then the problem of determining the similarity of objects turns into the problem of graph matching. Some of the most common graph matching paradigms include...

Publication date: 2013